Search | WHO COVID-19 Research Database

McAN: a novel computational algorithm and platform for constructing and visualizing haplotype networks.

Li, Lun; Xu, Bo; Tian, Dongmei; Wang, Anke; Zhu, Junwei; Li, Cuiping; Li, Na; Zhao, Wei; Shi, Leisheng; Xue, Yongbiao; Zhang, Zhang; Bao, Yiming; Zhao, Wenming; Song, Shuhui.

Brief Bioinform ; 2023 May 12.

Article in English | MEDLINE | ID: covidwho-2316531

ABSTRACT

Haplotype networks are graphs that represent evolutionary relationships among a set of taxa and offer an intuitive way to analyze the genealogical relationships of closely related genomes. Here we propose a novel algorithm, termed McAN, that considers mutation spectrum history (mutations in an ancestral haplotype should be contained in its descendant haplotypes), node size (the sample count for a given node) and sampling time when constructing a haplotype network. We show that McAN is two orders of magnitude faster than state-of-the-art algorithms without losing accuracy, making it suitable for the analysis of a large number of sequences. Based on our algorithm, we developed an online web server and an offline tool for haplotype network construction, community lineage determination, and interactive network visualization. We demonstrate that McAN is highly suitable for analyzing and visualizing massive genomic data and helps enhance the understanding of genome evolution. Availability: Source code is written in C/C++ and available at https://github.com/Theory-Lun/McAN and https://ngdc.cncb.ac.cn/biocode/tools/BT007301 under the MIT license. The web server is available at https://ngdc.cncb.ac.cn/bit/hapnet/. SARS-CoV-2 datasets are available at https://ngdc.cncb.ac.cn/ncov/. Contact: songshh@big.ac.cn (Song S), zhaowm@big.ac.cn (Zhao W), baoym@big.ac.cn (Bao Y), zhangzhang@big.ac.cn (Zhang Z), ybxue@big.ac.cn (Xue Y).
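The three criteria named in the abstract (mutation containment, node size, sampling time) can be illustrated with a toy parent-selection rule. The sketch below is not the McAN implementation, which is written in C/C++. The `Haplotype` class, the `pick_parent` and `build_network` functions, and the tie-breaking order (fewest extra mutations, then larger node, then earlier sampling time) are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Haplotype:
    """Hypothetical record for one network node (not McAN's data model)."""
    name: str
    mutations: FrozenSet[str]  # mutations relative to a reference genome
    size: int                  # number of samples carrying this haplotype
    first_seen: int            # earliest sampling time, e.g. a day index


def pick_parent(child: Haplotype, candidates: List[Haplotype]) -> Optional[Haplotype]:
    """Pick a plausible ancestor of `child`, or None if there is none.

    Assumed rule: an ancestor's mutations must be a proper subset of the
    child's, and it must not have been sampled later than the child.
    """
    eligible = [
        h for h in candidates
        if h.mutations < child.mutations       # proper-subset test
        and h.first_seen <= child.first_seen
    ]
    if not eligible:
        return None
    # Assumed tie-breaking: closest ancestor, then larger node, then earlier.
    return min(
        eligible,
        key=lambda h: (len(child.mutations - h.mutations), -h.size, h.first_seen),
    )


def build_network(haplotypes: List[Haplotype]) -> Dict[str, Optional[str]]:
    """Map each haplotype name to the name of its chosen parent."""
    network: Dict[str, Optional[str]] = {}
    for h in haplotypes:
        parent = pick_parent(h, haplotypes)
        network[h.name] = parent.name if parent else None
    return network


if __name__ == "__main__":
    sample = [
        Haplotype("ref", frozenset(), 10, 0),
        Haplotype("A", frozenset({"C241T"}), 5, 1),
        Haplotype("B", frozenset({"C241T", "A23403G"}), 8, 2),
        Haplotype("D", frozenset({"G11083T"}), 1, 1),
    ]
    for child, parent in build_network(sample).items():
        print(child, "<-", parent)
```

This all-pairs scan is quadratic in the number of haplotypes and only illustrates the selection criteria; it says nothing about how McAN achieves the speed reported in the abstract.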

Towards comprehensive integration and curation of chloroplast genomes.

Hua, Zhongyi; Tian, Dongmei; Jiang, Chao; Song, Shuhui; Chen, Ziyuan; Zhao, Yuyang; Jin, Yan; Huang, Luqi; Zhang, Zhang; Yuan, Yuan.

Plant Biotechnol J ; 20(12): 2239-2241, 2022 Dec.

Article in English | MEDLINE | ID: covidwho-2137142

Subject(s)

Genome, Chloroplast , Genome, Chloroplast/genetics

Genomic Epidemiology of SARS-CoV-2 in Pakistan.

Song, Shuhui; Li, Cuiping; Kang, Lu; Tian, Dongmei; Badar, Nazish; Ma, Wentai; Zhao, Shilei; Jiang, Xuan; Wang, Chun; Sun, Yongqiao; Li, Wenjie; Lei, Meng; Li, Shuangli; Qi, Qiuhui; Ikram, Aamer; Salman, Muhammad; Umair, Massab; Shireen, Huma; Batool, Fatima; Zhang, Bing; Chen, Hua; Yang, Yun-Gui; Abbasi, Amir Ali; Li, Mingkun; Xue, Yongbiao; Bao, Yiming.

Genomics Proteomics Bioinformatics ; 19(5): 727-740, 2021 Oct.

Article in English | MEDLINE | ID: covidwho-1474586

ABSTRACT

COVID-19 has spread globally, and Pakistan is no exception. To investigate the initial introductions and transmissions of SARS-CoV-2 in Pakistan, we performed the largest genomic epidemiology study of COVID-19 in Pakistan and generated 150 complete SARS-CoV-2 genome sequences from samples collected from March 16 to June 1, 2020. We identified a total of 347 mutated positions, 31 of which were over-represented in Pakistan. Meanwhile, we found over 1000 intra-host single-nucleotide variants (iSNVs). Several of them occurred concurrently, indicating possible interactions among them or coevolution. Some of the high-frequency iSNVs in Pakistan were not observed in the global population, suggesting strong purifying selection. The genomic epidemiology revealed five distinct spreading clusters. The largest cluster consisted of 74 viruses that were derived from different geographic locations in Pakistan and formed a deep hierarchical structure, indicating an extensive and persistent nationwide transmission of the virus that was probably attributable to a signature mutation (G8371T in ORF1ab) of this cluster. Furthermore, 28 putative international introductions were identified, several of which are consistent with the epidemiological investigations. In all, this study has inferred the possible pathways of introduction and transmission of SARS-CoV-2 in Pakistan, which could aid ongoing and future viral surveillance and COVID-19 control.

Subject(s)

COVID-19 , SARS-CoV-2 , COVID-19/epidemiology , Genome, Viral , Genomics , Humans , Pakistan/epidemiology , Phylogeny , SARS-CoV-2/genetics

The Global Landscape of SARS-CoV-2 Genomes, Variants, and Haplotypes in 2019nCoVR.

Song, Shuhui; Ma, Lina; Zou, Dong; Tian, Dongmei; Li, Cuiping; Zhu, Junwei; Chen, Meili; Wang, Anke; Ma, Yingke; Li, Mengwei; Teng, Xufei; Cui, Ying; Duan, Guangya; Zhang, Mochen; Jin, Tong; Shi, Chengmin; Du, Zhenglin; Zhang, Yadong; Liu, Chuandong; Li, Rujiao; Zeng, Jingyao; Hao, Lili; Jiang, Shuai; Chen, Hua; Han, Dali; Xiao, Jingfa; Zhang, Zhang; Zhao, Wenming; Xue, Yongbiao; Bao, Yiming.

Genomics Proteomics Bioinformatics ; 18(6): 749-759, 2020 Dec.

Article in English | MEDLINE | ID: covidwho-987765

ABSTRACT

On January 22, 2020, the China National Center for Bioinformation (CNCB) released the 2019 Novel Coronavirus Resource (2019nCoVR), an open-access information resource for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). 2019nCoVR features a comprehensive integration of sequence and clinical information for all publicly available SARS-CoV-2 isolates, which are manually curated with value-added annotations and quality-evaluated by an automated in-house pipeline. Of particular note, 2019nCoVR offers systematic analyses to generate a dynamic landscape of SARS-CoV-2 genomic variations at a global scale. It provides all identified variants and their detailed statistics for each virus isolate, and aggregates the quality score, functional annotation, and population frequency for each variant. The spatiotemporal change of each variant can be visualized, and historical viral haplotype network maps for the course of the outbreak are also generated based on all complete and high-quality genomes available. Moreover, 2019nCoVR provides a full collection of SARS-CoV-2-related literature on the coronavirus disease 2019 (COVID-19), including published papers from PubMed as well as preprints from services such as bioRxiv and medRxiv through Europe PMC. Furthermore, by linking with relevant databases in CNCB, 2019nCoVR offers data submission services for raw sequence reads and assembled genomes, and data sharing with NCBI. Collectively, 2019nCoVR is updated daily to collect the latest information on genome sequences, variants, haplotypes, and literature, making it a valuable resource for the global research community. 2019nCoVR is accessible at https://bigd.big.ac.cn/ncov/.

Subject(s)

COVID-19 , SARS-CoV-2 , Genome, Viral , Genomics , Haplotypes , Humans

The 2019 novel coronavirus resource.

Zhao, Wen-Ming; Song, Shu-Hui; Chen, Mei-Li; Zou, Dong; Ma, Li-Na; Ma, Ying-Ke; Li, Ru-Jiao; Hao, Li-Li; Li, Cui-Ping; Tian, Dong-Mei; Tang, Bi-Xia; Wang, Yan-Qing; Zhu, Jun-Wei; Chen, Huan-Xin; Zhang, Zhang; Xue, Yong-Biao; Bao, Yi-Ming.

Yi Chuan ; 42(2): 212-221, 2020 Feb 20.

Article in English | MEDLINE | ID: covidwho-3031

ABSTRACT

An ongoing outbreak of a novel coronavirus infection in Wuhan, China since December 2019 has led to 31,516 infected persons and 638 deaths across 25 countries (as of 16:00 on February 7, 2020). The virus causing this pneumonia was named the 2019 novel coronavirus (2019-nCoV) by the World Health Organization. To promote data sharing and make all relevant information on 2019-nCoV publicly available, we constructed the 2019 Novel Coronavirus Resource (2019nCoVR, https://bigd.big.ac.cn/ncov). 2019nCoVR features comprehensive integration of genomic and proteomic sequences as well as their metadata from the Global Initiative on Sharing All Influenza Data, National Center for Biotechnology Information, China National GeneBank, National Microbiology Data Center and China National Center for Bioinformation (CNCB)/National Genomics Data Center (NGDC). It also incorporates a wide range of relevant information, including scientific literature, news, and popular articles for science dissemination, and provides visualization functionalities for genome variation analysis results based on all collected 2019-nCoV strains. Moreover, by linking seamlessly with related databases in CNCB/NGDC, 2019nCoVR offers virus data submission and sharing services for raw sequence reads and assembled sequences. In this report, we provide comprehensive descriptions of data deposition, management, release and utility in 2019nCoVR, laying important foundations for studies on virus classification and origin, genome variation and evolution, fast detection, drug development, and pneumonia precision prevention and therapy.

Subject(s)

Betacoronavirus , Coronavirus Infections/epidemiology , Databases, Genetic , Information Dissemination , Pneumonia, Viral/epidemiology , Pneumonia, Viral/virology , COVID-19 , China , Coronavirus , Coronavirus Infections/virology , Genomics , Humans , Pandemics , Proteomics , SARS-CoV-2
